Overview

Dataset statistics

Number of variables: 22
Number of observations: 2109
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 897
Duplicate rows (%): 42.5%
Total size in memory: 348.2 KiB
Average record size in memory: 169.1 B
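The percentage and per-record figures above follow from the raw counts. The short sketch below recomputes them as a sanity check; the variable names are chosen for this example, and it assumes that 1 KiB means 1024 bytes.

```python
# Figures copied from the dataset statistics above.
n_observations = 2109
duplicate_rows = 897
total_size_bytes = 348.2 * 1024  # assumption: 1 KiB = 1024 bytes

# Share of rows that are exact duplicates, in percent.
duplicate_pct = 100 * duplicate_rows / n_observations
# Average memory footprint of a single record, in bytes.
avg_record_bytes = total_size_bytes / n_observations

print(f"duplicates: {duplicate_pct:.1f}%  average record: {avg_record_bytes:.1f} B")
```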

Variable types

NUM: 21
BOOL: 1

Reproduction

Analysis started: 2020-03-22 12:06:06.678954
Analysis finished: 2020-03-22 12:07:48.426962
Version: pandas-profiling v2.5.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Dataset has 897 (42.5%) duplicate rows (Duplicates)
v(g) is highly correlated with loc and 7 other fields (High Correlation)
loc is highly correlated with v(g) and 8 other fields (High Correlation)
iv(g) is highly correlated with v(g) and 3 other fields (High Correlation)
n is highly correlated with loc and 10 other fields (High Correlation)
v is highly correlated with loc and 10 other fields (High Correlation)
e is highly correlated with n and 5 other fields (High Correlation)
b is highly correlated with loc and 8 other fields (High Correlation)
t is highly correlated with n and 5 other fields (High Correlation)
lOCode is highly correlated with loc and 9 other fields (High Correlation)
uniq_Op is highly correlated with d (High Correlation)
d is highly correlated with uniq_Op (High Correlation)
uniq_Opnd is highly correlated with loc and 7 other fields (High Correlation)
i is highly correlated with uniq_Opnd (High Correlation)
total_Op is highly correlated with loc and 11 other fields (High Correlation)
total_Opnd is highly correlated with loc and 10 other fields (High Correlation)
branchCount is highly correlated with loc and 7 other fields (High Correlation)
n has 55 (2.6%) zeros (Zeros)
v has 158 (7.5%) zeros (Zeros)
l has 167 (7.9%) zeros (Zeros)
d has 167 (7.9%) zeros (Zeros)
i has 167 (7.9%) zeros (Zeros)
e has 167 (7.9%) zeros (Zeros)
b has 701 (33.2%) zeros (Zeros)
t has 167 (7.9%) zeros (Zeros)
lOCode has 536 (25.4%) zeros (Zeros)
lOComment has 1618 (76.7%) zeros (Zeros)
lOBlank has 1219 (57.8%) zeros (Zeros)
locCodeAndComment has 1982 (94.0%) zeros (Zeros)
uniq_Op has 59 (2.8%) zeros (Zeros)
uniq_Opnd has 163 (7.7%) zeros (Zeros)
total_Op has 59 (2.8%) zeros (Zeros)
total_Opnd has 163 (7.7%) zeros (Zeros)

Variables

loc
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 139
Unique (%): 6.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.37226174
Minimum: 1
Maximum: 288
Zeros: 0
Zeros (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 9
Q3: 24
95-th percentile: 76.6
Maximum: 288
Range: 287
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 29.75444211
Coefficient of variation (CV): 1.460537003
Kurtosis: 16.38213844
Mean: 20.37226174
Median Absolute Deviation (MAD): 19.15554877
Skewness: 3.352322323
Sum: 42965.1
Variance: 885.3268251
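
Several of the figures reported for loc are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the sum is the mean times the number of observations, and the IQR and range are differences of quantiles. The sketch below recomputes them from the reported values as a consistency check; the variable names are chosen for this example only.

```python
# Figures copied from the report (overview and the "loc" variable).
n_observations = 2109
mean = 20.37226174
std_dev = 29.75444211
q1, q3 = 3, 24
minimum, maximum = 1, 288

coefficient_of_variation = std_dev / mean  # reported as 1.460537003
variance = std_dev ** 2                    # reported as 885.3268251
total = mean * n_observations              # reported Sum: 42965.1
iqr = q3 - q1                              # reported as 21
value_range = maximum - minimum            # reported as 287

print(f"CV={coefficient_of_variation:.6f} variance={variance:.4f} "
      f"sum={total:.1f} IQR={iqr} range={value_range}")
```

The same identities can be used to cross-check the statistics of the other numeric variables.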
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.05 1.55 2.5 3.5 ... 49.5 65.5 94. 154. 288. ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
2 279 13.2%
 
4 242 11.5%
 
1 148 7.0%
 
3 102 4.8%
 
6 80 3.8%
 
5 75 3.6%
 
15 53 2.5%
 
9 51 2.4%
 
7 49 2.3%
 
16 47 2.2%
 
Other values (129) 983 46.6%
 
Value Count Frequency (%)
1 148 7.0%
 
1.1 1 < 0.1%
 
2 279 13.2%
 
3 102 4.8%
 
4 242 11.5%
 
Value Count Frequency (%)
288 1 < 0.1%
 
286 1 < 0.1%
 
283 1 < 0.1%
 
220 1 < 0.1%
 
217 1 < 0.1%
 

v(g)
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 31
Unique (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.838027501
Minimum: 1
Maximum: 45
Zeros: 0
Zeros (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 3
95-th percentile: 11
Maximum: 45
Range: 44
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 3.900763208
Coefficient of variation (CV): 1.374462794
Kurtosis: 19.05927477
Mean: 2.838027501
Median Absolute Deviation (MAD): 2.375093039
Skewness: 3.737101379
Sum: 5985.4
Variance: 15.2159536
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.2 1.7 2.5 3.5 ... 7.5 13.5 19.5 28.5 45. ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
1 1236 58.6%
 
2 276 13.1%
 
3 156 7.4%
 
4 95 4.5%
 
5 86 4.1%
 
7 43 2.0%
 
6 42 2.0%
 
9 25 1.2%
 
8 24 1.1%
 
11 18 0.9%
 
Other values (21) 108 5.1%
 
Value Count Frequency (%)
1 1236 58.6%
 
1.4 1 < 0.1%
 
2 276 13.1%
 
3 156 7.4%
 
4 95 4.5%
 
Value Count Frequency (%)
45 1 < 0.1%
 
34 1 < 0.1%
 
29 1 < 0.1%
 
28 2 0.1%
 
27 3 0.1%
 

ev(g)
Real number (ℝ≥0)

Distinct count: 21
Unique (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.674442864
Minimum: 1
Maximum: 26
Zeros: 0
Zeros (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 6
Maximum: 26
Range: 25
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 2.200658914
Coefficient of variation (CV): 1.314263365
Kurtosis: 27.35303828
Mean: 1.674442864
Median Absolute Deviation (MAD): 1.167503717
Skewness: 4.632831818
Sum: 3531.4
Variance: 4.842899654
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.2 2.2 3.5 8.5 13.5 26. ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
1 1825 86.5%
 
3 101 4.8%
 
5 40 1.9%
 
4 27 1.3%
 
7 26 1.2%
 
8 22 1.0%
 
6 15 0.7%
 
11 11 0.5%
 
10 11 0.5%
 
9 8 0.4%
 
Other values (11) 23 1.1%
 
Value Count Frequency (%)
1 1825 86.5%
 
1.4 1 < 0.1%
 
3 101 4.8%
 
4 27 1.3%
 
5 40 1.9%
 
Value Count Frequency (%)
26 1 < 0.1%
 
22 1 < 0.1%
 
21 1 < 0.1%
 
19 2 0.1%
 
18 1 < 0.1%
 

iv(g)
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 26
Unique (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.546420104
Minimum: 1
Maximum: 45
Zeros: 0
Zeros (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 3
95-th percentile: 9.6
Maximum: 45
Range: 44
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 3.37585926
Coefficient of variation (CV): 1.32572754
Kurtosis: 24.35597406
Mean: 2.546420104
Median Absolute Deviation (MAD): 2.037439131
Skewness: 4.018701298
Sum: 5370.4
Variance: 11.39642575
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.2 1.7 2.5 3.5 5.5 7.5 13.5 18.5 45. ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
1 1290 61.2%
 
2 279 13.2%
 
3 155 7.3%
 
4 98 4.6%
 
5 68 3.2%
 
6 45 2.1%
 
7 30 1.4%
 
8 19 0.9%
 
9 18 0.9%
 
11 17 0.8%
 
Other values (16) 90 4.3%
 
Value Count Frequency (%)
1 1290 61.2%
 
1.4 1 < 0.1%
 
2 279 13.2%
 
3 155 7.3%
 
4 98 4.6%
 
Value Count Frequency (%)
45 1 < 0.1%
 
29 2 0.1%
 
27 1 < 0.1%
 
25 3 0.1%
 
22 1 < 0.1%
 

n
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count: 278
Unique (%): 13.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.82944523
Minimum: 0
Maximum: 1106
Zeros: 55
Zeros (%): 2.6%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 4
Median: 16
Q3: 58
95-th percentile: 211.6
Maximum: 1106
Range: 1106
Interquartile range (IQR): 54

Descriptive statistics

Standard deviation: 83.5998742
Coefficient of variation (CV): 1.677720348
Kurtosis: 22.40960609
Mean: 49.82944523
Median Absolute Deviation (MAD): 53.22584224
Skewness: 3.697793058
Sum: 105090.3
Variance: 6988.938967
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 1.15 2.5 3.5 4.5 ... 157. 219.5 364. 529.5 1106. ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
4 345 16.4%
 
5 140 6.6%
 
1 103 4.9%
 
9 61 2.9%
 
0 55 2.6%
 
7 50 2.4%
 
3 49 2.3%
 
8 36 1.7%
 
10 34 1.6%
 
6 34 1.6%
 
Other values (268) 1202 57.0%
 
Value Count Frequency (%)
0 55 2.6%
 
1 103 4.9%
 
1.3 1 < 0.1%
 
2 9 0.4%
 
3 49 2.3%
 
Value Count Frequency (%)
1106 1 < 0.1%
 
794 1 < 0.1%
 
606 1 < 0.1%
 
532 1 < 0.1%
 
527 1 < 0.1%
 

v
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count: 729
Unique (%): 34.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 258.6967188
Minimum: 0
Maximum: 7918.82
Zeros: 158
Zeros (%): 7.5%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 8
Median: 57.06
Q3: 265.93
95-th percentile: 1190.032
Maximum: 7918.82
Range: 7918.82
Interquartile range (IQR): 257.93

Descriptive statistics

Standard deviation: 516.3176046
Coefficient of variation (CV): 1.995841335
Kurtosis: 36.03353859
Mean: 258.6967188
Median Absolute Deviation (MAD): 305.8637163
Skewness: 4.581499225
Sum: 545591.38
Variance: 266583.8688
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.000000e+00 5.000000e-01 3.375000e+00 5.545000e+00 7.170000e+00 ... 8.222650e+02 1.315230e+03 2.311205e+03 3.420875e+03 7.918820e+03], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
8 344 16.3%
 
0 158 7.5%
 
11.61 138 6.5%
 
4.75 49 2.3%
 
19.65 34 1.6%
 
15.51 33 1.6%
 
27 29 1.4%
 
31.7 28 1.3%
 
24 23 1.1%
 
28.53 20 0.9%
 
Other values (719) 1253 59.4%
 
Value Count Frequency (%)
0 158 7.5%
 
1 1 < 0.1%
 
1.3 1 < 0.1%
 
2 8 0.4%
 
4.75 49 2.3%
 
Value Count Frequency (%)
7918.82 1 < 0.1%
 
5228.46 1 < 0.1%
 
3820.09 1 < 0.1%
 
3493.67 1 < 0.1%
 
3348.08 1 < 0.1%
 

l
Real number (ℝ≥0)

ZEROS
Distinct count: 52
Unique (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3195827406
Minimum: 0
Maximum: 2
Zeros: 167
Zeros (%): 7.9%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.08
Median: 0.2
Q3: 0.67
95-th percentile: 0.67
Maximum: 2
Range: 2
Interquartile range (IQR): 0.59

Descriptive statistics

Standard deviation: 0.3170290459
Coefficient of variation (CV): 0.9920092846
Kurtosis: 5.613137251
Mean: 0.3195827406
Median Absolute Deviation (MAD): 0.2505706425
Skewness: 1.793769292
Sum: 674
Variance: 0.1005074159
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.01 0.025 0.115 0.185 ... 0.79 0.955 1.15 1.8 2. ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0.67 434 20.6%
 
0 167 7.9%
 
0.4 108 5.1%
 
0.5 84 4.0%
 
0.33 83 3.9%
 
0.06 79 3.7%
 
0.07 76 3.6%
 
1 73 3.5%
 
0.05 70 3.3%
 
0.11 62 2.9%
 
Other values (42) 873 41.4%
 
Value Count Frequency (%)
0 167 7.9%
 
0.02 16 0.8%
 
0.03 42 2.0%
 
0.04 56 2.7%
 
0.05 70 3.3%
 
Value Count Frequency (%)
2 19 0.9%
 
1.6 1 < 0.1%
 
1.33 1 < 0.1%
 
1.3 1 < 0.1%
 
1 73 3.5%
 

d
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count: 548
Unique (%): 26.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.771242295
Minimum: 0
Maximum: 53.75
Zeros: 167
Zeros (%): 7.9%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1.5
Median: 3.5
Q3: 9.2
95-th percentile: 22.862
Maximum: 53.75
Range: 53.75
Interquartile range (IQR): 7.7

Descriptive statistics

Standard deviation: 7.863645549
Coefficient of variation (CV): 1.161329813
Kurtosis: 5.597748822
Mean: 6.771242295
Median Absolute Deviation (MAD): 5.704701668
Skewness: 2.13936412
Sum: 14280.55
Variance: 61.83692132
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.25 0.565 0.875 1.05 ... 10.105 16.525 20.145 31.375 53.75 ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
1.5 434 20.6%
 
0 167 7.9%
 
2.5 108 5.1%
 
2 84 4.0%
 
3 83 3.9%
 
1 73 3.5%
 
3.5 48 2.3%
 
6 31 1.5%
 
4.5 29 1.4%
 
5 22 1.0%
 
Other values (538) 1030 48.8%
 
Value Count Frequency (%)
0 167 7.9%
 
0.5 19 0.9%
 
0.63 1 < 0.1%
 
0.75 1 < 0.1%
 
1 73 3.5%
 
Value Count Frequency (%)
53.75 1 < 0.1%
 
51.24 1 < 0.1%
 
49.38 1 < 0.1%
 
49.32 1 < 0.1%
 
48.32 1 < 0.1%
 

i
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count: 893
Unique (%): 42.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.24007112
Minimum: 0
Maximum: 193.06
Zeros: 167
Zeros (%): 7.9%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5.33
Median: 14.4
Q3: 29.85
95-th percentile: 61.904
Maximum: 193.06
Range: 193.06
Interquartile range (IQR): 24.52

Descriptive statistics

Standard deviation: 21.50036691
Coefficient of variation (CV): 1.01225494
Kurtosis: 7.322907801
Mean: 21.24007112
Median Absolute Deviation (MAD): 15.90230137
Skewness: 2.131496367
Sum: 44795.31
Variance: 462.2657774
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.5 3.74 4.595 4.91 ... 46.7 61.435 81.785 107.295 193.06 ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
5.33 319 15.1%
 
0 167 7.9%
 
7.74 100 4.7%
 
4.75 44 2.1%
 
5.8 34 1.6%
 
8 24 1.1%
 
9.51 21 1.0%
 
12.68 21 1.0%
 
7.86 21 1.0%
 
7.75 18 0.9%
 
Other values (883) 1340 63.5%
 
Value Count Frequency (%)
0 167 7.9%
 
1 1 < 0.1%
 
1.3 1 < 0.1%
 
2.77 1 < 0.1%
 
3.62 2 0.1%
 
Value Count Frequency (%)
193.06 1 < 0.1%
 
166.37 1 < 0.1%
 
165.42 1 < 0.1%
 
154.03 1 < 0.1%
 
143.73 1 < 0.1%
 

e
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count: 961
Unique (%): 45.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5242.38624
Minimum: 0
Maximum: 324803.51
Zeros: 167
Zeros (%): 7.9%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 12
Median: 213.97
Q3: 2276.02
95-th percentile: 24831.498
Maximum: 324803.51
Range: 324803.51
Interquartile range (IQR): 2264.02

Descriptive statistics

Standard deviation: 17444.98121
Coefficient of variation (CV): 3.327679498
Kurtosis: 89.19927672
Mean: 5242.38624
Median Absolute Deviation (MAD): 7619.160428
Skewness: 7.69911315
Sum: 11056192.58
Variance: 304327369.5
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.00000000e+00 5.00000000e-01 4.37500000e+00 5.27500000e+00 7.87500000e+00 ... 1.47024600e+04 2.28208100e+04 5.49366350e+04 1.28400385e+05 3.24803510e+05], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
12 319 15.1%
 
0 167 7.9%
 
17.41 100 4.7%
 
4.75 44 2.1%
 
23.22 34 1.6%
 
49.13 21 1.0%
 
79.25 21 1.0%
 
8 20 0.9%
 
31.02 18 0.9%
 
60 17 0.8%
 
Other values (951) 1348 63.9%
 
Value Count Frequency (%)
0 167 7.9%
 
1 8 0.4%
 
1.3 1 < 0.1%
 
2.38 5 0.2%
 
4 2 0.1%
 
Value Count Frequency (%)
324803.51 1 < 0.1%
 
234743.54 1 < 0.1%
 
176429.01 1 < 0.1%
 
159215.89 1 < 0.1%
 
156823.6 1 < 0.1%
 

b
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count: 92
Unique (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.08673779042
Minimum: 0
Maximum: 2.64
Zeros: 701
Zeros (%): 33.2%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0.02
Q3: 0.09
95-th percentile: 0.4
Maximum: 2.64
Range: 2.64
Interquartile range (IQR): 0.09

Descriptive statistics

Standard deviation: 0.1755065264
Coefficient of variation (CV): 2.023414771
Kurtosis: 34.44000481
Mean: 0.08673779042
Median Absolute Deviation (MAD): 0.1037411747
Skewness: 4.529341431
Sum: 182.93
Variance: 0.0308025408
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.005 0.015 0.035 0.055 ... 0.275 0.375 0.685 1.14 2.64 ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0 701 33.2%
 
0.01 295 14.0%
 
0.02 136 6.4%
 
0.03 129 6.1%
 
0.04 94 4.5%
 
0.05 74 3.5%
 
0.06 55 2.6%
 
0.08 48 2.3%
 
0.1 39 1.8%
 
0.07 39 1.8%
 
Other values (82) 499 23.7%
 
Value Count Frequency (%)
0 701 33.2%
 
0.01 295 14.0%
 
0.02 136 6.4%
 
0.03 129 6.1%
 
0.04 94 4.5%
 
Value Count Frequency (%)
2.64 1 < 0.1%
 
1.74 1 < 0.1%
 
1.3 1 < 0.1%
 
1.27 1 < 0.1%
 
1.16 1 < 0.1%
 

t
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count: 947
Unique (%): 44.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 291.2450403
Minimum: 0
Maximum: 18044.64
Zeros: 167
Zeros (%): 7.9%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.67
Median: 11.89
Q3: 126.45
95-th percentile: 1379.526
Maximum: 18044.64
Range: 18044.64
Interquartile range (IQR): 125.78

Descriptive statistics

Standard deviation: 969.1651603
Coefficient of variation (CV): 3.327662367
Kurtosis: 89.19941613
Mean: 291.2450403
Median Absolute Deviation (MAD): 423.2862068
Skewness: 7.699120009
Sum: 614235.79
Variance: 939281.1079
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.000000e+00 3.000000e-02 2.400000e-01 2.900000e-01 4.350000e-01 ... 8.168000e+02 1.267820e+03 3.052035e+03 7.133355e+03 1.804464e+04], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0.67 319 15.1%
 
0 167 7.9%
 
0.97 100 4.7%
 
0.26 44 2.1%
 
1.29 36 1.7%
 
2.73 21 1.0%
 
4.4 21 1.0%
 
0.44 20 0.9%
 
1.72 18 0.9%
 
3.33 17 0.8%
 
Other values (937) 1346 63.8%
 
Value Count Frequency (%)
0 167 7.9%
 
0.06 7 0.3%
 
0.13 5 0.2%
 
0.22 2 0.1%
 
0.26 44 2.1%
 
Value Count Frequency (%)
18044.64 1 < 0.1%
 
13041.31 1 < 0.1%
 
9801.61 1 < 0.1%
 
8845.33 1 < 0.1%
 
8712.42 1 < 0.1%
 

lOCode
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count: 121
Unique (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.52536747
Minimum: 0
Maximum: 262
Zeros: 536
Zeros (%): 25.4%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 5
Q3: 17
95-th percentile: 60
Maximum: 262
Range: 262
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 24.18830199
Coefficient of variation (CV): 1.665245443
Kurtosis: 17.84364165
Mean: 14.52536747
Median Absolute Deviation (MAD): 15.64540913
Skewness: 3.416810845
Sum: 30634
Variance: 585.073953
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.5 1.5 2.5 6.5 ... 38.5 61.5 97.5 143. 262. ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0 536 25.4%
 
2 203 9.6%
 
1 125 5.9%
 
3 83 3.9%
 
6 77 3.7%
 
4 69 3.3%
 
5 61 2.9%
 
8 60 2.8%
 
12 55 2.6%
 
9 53 2.5%
 
Other values (111) 787 37.3%
 
Value Count Frequency (%)
0 536 25.4%
 
1 125 5.9%
 
2 203 9.6%
 
3 83 3.9%
 
4 69 3.3%
 
Value Count Frequency (%)
262 1 < 0.1%
 
251 1 < 0.1%
 
198 1 < 0.1%
 
179 1 < 0.1%
 
173 1 < 0.1%
 

lOComment
Real number (ℝ≥0)

ZEROS
Distinct count: 28
Unique (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9459459459
Minimum: 0
Maximum: 44
Zeros: 1618
Zeros (%): 76.7%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 5
Maximum: 44
Range: 44
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 3.085271204
Coefficient of variation (CV): 3.261572416
Kurtosis: 60.51908279
Mean: 0.9459459459
Median Absolute Deviation (MAD): 1.451437212
Skewness: 6.569980369
Sum: 1995
Variance: 9.518898405
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.5 1.5 2.5 4.5 6.5 11.5 26.5 44. ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0 1618 76.7%
 
1 181 8.6%
 
2 92 4.4%
 
3 53 2.5%
 
4 41 1.9%
 
5 27 1.3%
 
6 18 0.9%
 
8 11 0.5%
 
11 11 0.5%
 
10 10 0.5%
 
Other values (18) 47 2.2%
 
Value Count Frequency (%)
0 1618 76.7%
 
1 181 8.6%
 
2 92 4.4%
 
3 53 2.5%
 
4 41 1.9%
 
Value Count Frequency (%)
44 2 0.1%
 
35 1 < 0.1%
 
27 1 < 0.1%
 
26 2 0.1%
 
24 2 0.1%
 

lOBlank
Real number (ℝ≥0)

ZEROS
Distinct count: 31
Unique (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.759601707
Minimum: 0
Maximum: 58
Zeros: 1219
Zeros (%): 57.8%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 2
95-th percentile: 9
Maximum: 58
Range: 58
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 3.856850166
Coefficient of variation (CV): 2.19188817
Kurtosis: 37.02712307
Mean: 1.759601707
Median Absolute Deviation (MAD): 2.217783704
Skewness: 4.780010864
Sum: 3711
Variance: 14.8752932
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.5 2.5 3.5 6.5 13.5 24.5 58. ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0 1219 57.8%
 
1 255 12.1%
 
2 209 9.9%
 
3 127 6.0%
 
4 58 2.8%
 
5 49 2.3%
 
6 44 2.1%
 
7 24 1.1%
 
9 17 0.8%
 
8 15 0.7%
 
Other values (21) 92 4.4%
 
Value Count Frequency (%)
0 1219 57.8%
 
1 255 12.1%
 
2 209 9.9%
 
3 127 6.0%
 
4 58 2.8%
 
Value Count Frequency (%)
58 1 < 0.1%
 
35 1 < 0.1%
 
34 1 < 0.1%
 
32 1 < 0.1%
 
28 3 0.1%
 

locCodeAndComment
Real number (ℝ≥0)

ZEROS
Distinct count: 12
Unique (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1327643433
Minimum: 0
Maximum: 12
Zeros: 1982
Zeros (%): 94.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 12
Range: 12
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7040227269
Coefficient of variation (CV): 5.302799753
Kurtosis: 103.0790452
Mean: 0.1327643433
Median Absolute Deviation (MAD): 0.2495390502
Skewness: 8.78903415
Sum: 280
Variance: 0.4956479999
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.5 2.5 5.5 12. ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0 1982 94.0%
 
1 59 2.8%
 
2 37 1.8%
 
3 14 0.7%
 
5 5 0.2%
 
4 5 0.2%
 
7 2 0.1%
 
11 1 < 0.1%
 
9 1 < 0.1%
 
12 1 < 0.1%
 
Other values (2) 2 0.1%
 
Value Count Frequency (%)
0 1982 94.0%
 
1 59 2.8%
 
2 37 1.8%
 
3 14 0.7%
 
4 5 0.2%
 
Value Count Frequency (%)
12 1 < 0.1%
 
11 1 < 0.1%
 
9 1 < 0.1%
 
8 1 < 0.1%
 
7 2 0.1%
 

uniq_Op
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count: 34
Unique (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.631673779
Minimum: 0
Maximum: 37
Zeros: 59
Zeros (%): 2.8%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
Median: 6
Q3: 11
95-th percentile: 19
Maximum: 37
Range: 37
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 5.730346504
Coefficient of variation (CV): 0.7508636598
Kurtosis: 1.241395021
Mean: 7.631673779
Median Absolute Deviation (MAD): 4.550690902
Skewness: 1.14076585
Sum: 16095.2
Variance: 32.83687106
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.5 1.1 1.6 2.5 ... 14.5 18.5 22.5 28.5 37. ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
3 442 21.0%
 
8 151 7.2%
 
5 145 6.9%
 
6 131 6.2%
 
1 122 5.8%
 
7 122 5.8%
 
9 111 5.3%
 
4 103 4.9%
 
10 90 4.3%
 
12 78 3.7%
 
Other values (24) 614 29.1%
 
Value Count Frequency (%)
0 59 2.8%
 
1 122 5.8%
 
1.2 1 < 0.1%
 
2 78 3.7%
 
3 442 21.0%
 
Value Count Frequency (%)
37 1 < 0.1%
 
31 2 0.1%
 
30 2 0.1%
 
29 1 < 0.1%
 
28 6 0.3%
 

uniq_Opnd
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count: 73
Unique (%): 3.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.537316264
Minimum: 0
Maximum: 120
Zeros: 163
Zeros (%): 7.7%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 5
Q3: 13
95-th percentile: 35
Maximum: 120
Range: 120
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 12.19572656
Coefficient of variation (CV): 1.278737773
Kurtosis: 8.187672852
Mean: 9.537316264
Median Absolute Deviation (MAD): 8.72374778
Skewness: 2.387301069
Sum: 20114.2
Variance: 148.7357463
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.5 1.1 1.6 2.5 ... 25.5 44.5 59.5 74.5 120. ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
1 437 20.7%
 
2 196 9.3%
 
0 163 7.7%
 
6 110 5.2%
 
3 109 5.2%
 
4 108 5.1%
 
5 94 4.5%
 
8 72 3.4%
 
7 65 3.1%
 
10 65 3.1%
 
Other values (63) 690 32.7%
 
Value Count Frequency (%)
0 163 7.7%
 
1 437 20.7%
 
1.2 1 < 0.1%
 
2 196 9.3%
 
3 109 5.2%
 
Value Count Frequency (%)
120 1 < 0.1%
 
84 1 < 0.1%
 
75 1 < 0.1%
 
74 1 < 0.1%
 
73 1 < 0.1%
 

total_Op
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count: 207
Unique (%): 9.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.0437174
Minimum: 0
Maximum: 678
Zeros: 59
Zeros (%): 2.8%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
Median: 10
Q3: 36
95-th percentile: 129.6
Maximum: 678
Range: 678
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 51.77605592
Coefficient of variation (CV): 1.667843295
Kurtosis: 22.42277671
Mean: 31.0437174
Median Absolute Deviation (MAD): 32.99444657
Skewness: 3.698032058
Sum: 65471.2
Variance: 2680.759966
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.000e+00 5.000e-01 1.100e+00 1.600e+00 2.500e+00 ... 1.175e+02 1.365e+02 2.100e+02 3.375e+02 6.780e+02], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
3 430 20.4%
 
1 120 5.7%
 
4 78 3.7%
 
5 74 3.5%
 
2 69 3.3%
 
6 66 3.1%
 
7 64 3.0%
 
0 59 2.8%
 
9 43 2.0%
 
12 41 1.9%
 
Other values (197) 1065 50.5%
 
Value Count Frequency (%)
0 59 2.8%
 
1 120 5.7%
 
1.2 1 < 0.1%
 
2 69 3.3%
 
3 430 20.4%
 
Value Count Frequency (%)
678 1 < 0.1%
 
509 1 < 0.1%
 
398 1 < 0.1%
 
345 1 < 0.1%
 
330 1 < 0.1%
 

total_Opnd
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS
Distinct count: 153
Unique (%): 7.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.78672357
Minimum: 0
Maximum: 428
Zeros: 163
Zeros (%): 7.7%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 6
Q3: 22
95-th percentile: 80.6
Maximum: 428
Range: 428
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 32.07439842
Coefficient of variation (CV): 1.7072907
Kurtosis: 22.45636693
Mean: 18.78672357
Median Absolute Deviation (MAD): 20.29876375
Skewness: 3.716465533
Sum: 39621.2
Variance: 1028.767034
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.5 1.1 1.6 2.5 ... 65.5 106.5 142.5 213.5 428. ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
1 426 20.2%
 
2 188 8.9%
 
0 163 7.7%
 
3 93 4.4%
 
4 92 4.4%
 
10 68 3.2%
 
5 63 3.0%
 
6 54 2.6%
 
8 53 2.5%
 
7 50 2.4%
 
Other values (143) 859 40.7%
 
Value Count Frequency (%)
0 163 7.7%
 
1 426 20.2%
 
1.2 1 < 0.1%
 
2 188 8.9%
 
3 93 4.4%
 
Value Count Frequency (%)
428 1 < 0.1%
 
285 1 < 0.1%
 
215 1 < 0.1%
 
212 1 < 0.1%
 
208 1 < 0.1%
 

branchCount
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 44
Unique (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.665908013
Minimum: 1
Maximum: 89
Zeros: 0
Zeros (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 5
95-th percentile: 21
Maximum: 89
Range: 88
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 7.792206452
Coefficient of variation (CV): 1.67003002
Kurtosis: 19.17806457
Mean: 4.665908013
Median Absolute Deviation (MAD): 4.723363687
Skewness: 3.755886666
Sum: 9840.4
Variance: 60.7184814
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.2 2.2 3.5 4.5 ... 17.5 25.5 38. 56. 89. ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
1 1233 58.5%
 
3 273 12.9%
 
5 158 7.5%
 
7 90 4.3%
 
9 71 3.4%
 
11 39 1.8%
 
13 37 1.8%
 
15 23 1.1%
 
17 23 1.1%
 
21 19 0.9%
 
Other values (34) 143 6.8%
 
Value Count Frequency (%)
1 1233 58.5%
 
1.4 1 < 0.1%
 
3 273 12.9%
 
4 4 0.2%
 
5 158 7.5%
 
Value Count Frequency (%)
89 1 < 0.1%
 
67 1 < 0.1%
 
57 1 < 0.1%
 
55 1 < 0.1%
 
54 1 < 0.1%
 

defects
Boolean

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 KiB

Value Count Frequency (%)
False 1783 84.5%
True 326 15.5%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a perfectly linear relationship the steepness of the line does not affect r; only its direction does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
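
As a rough illustration of that calculation, the following pure-Python sketch computes r for two equal-length lists. The function name `pearson_r` and the sample data are assumptions made for this example; they are not taken from the report or from pandas-profiling.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariance of xs and ys over the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and of the standard deviations cancel,
    # so plain sums of products are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical sample data, not taken from the profiled dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r_xy = pearson_r(x, y)
# Shifting y and rescaling it by a positive factor leaves r unchanged.
r_shifted = pearson_r(x, [10.0 * v + 3.0 for v in y])
print(r_xy, r_shifted)
```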

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
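
A minimal sketch of this recipe, assuming that tied values receive the average of the ranks they span: rank both variables, then apply Pearson's formula to the ranks. The helper names and the sample data are illustrative and are not part of the report.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance over the product of standard deviations (1/n factors cancel)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this cubic relationship ρ reaches its maximum while r stays below it, which is the nonlinear monotonic case described above.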

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
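
The pair-counting definition above can be sketched directly in Python. This is the simple variant often called tau-a, in which tied pairs count as neither concordant nor discordant; statistical libraries commonly apply a tie-adjusted variant instead, so results may differ on data with many ties. The function name and sample data are assumptions made for this example.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts as neither
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: two of the three pairs are concordant, one is discordant.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```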

Missing values

Sample

First rows

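index | loc | v(g) | ev(g) | iv(g) | n | v | l | d | i | e | b | t | lOCode | lOComment | lOBlank | locCodeAndComment | uniq_Op | uniq_Opnd | total_Op | total_Opnd | branchCount | defects
0 | 1.1 | 1.4 | 1.4 | 1.4 | 1.3 | 1.30 | 1.30 | 1.30 | 1.30 | 1.30 | 1.30 | 1.30 | 2 | 2 | 2 | 2 | 1.2 | 1.2 | 1.2 | 1.2 | 1.4 | False
1 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1 | 1 | 1 | 1 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | True
2 | 83.0 | 11.0 | 1.0 | 11.0 | 171.0 | 927.89 | 0.04 | 23.04 | 40.27 | 21378.61 | 0.31 | 1187.70 | 65 | 10 | 6 | 0 | 18.0 | 25.0 | 107.0 | 64.0 | 21.0 | True
3 | 46.0 | 8.0 | 6.0 | 8.0 | 141.0 | 769.78 | 0.07 | 14.86 | 51.81 | 11436.73 | 0.26 | 635.37 | 37 | 2 | 5 | 0 | 16.0 | 28.0 | 89.0 | 52.0 | 15.0 | True
4 | 25.0 | 3.0 | 1.0 | 3.0 | 58.0 | 254.75 | 0.11 | 9.35 | 27.25 | 2381.95 | 0.08 | 132.33 | 21 | 0 | 2 | 0 | 11.0 | 10.0 | 41.0 | 17.0 | 5.0 | True
5 | 43.0 | 3.0 | 1.0 | 3.0 | 115.0 | 569.73 | 0.09 | 11.27 | 50.53 | 6423.73 | 0.19 | 356.87 | 35 | 2 | 4 | 0 | 11.0 | 20.0 | 74.0 | 41.0 | 5.0 | True
6 | 48.0 | 6.0 | 1.0 | 6.0 | 149.0 | 751.61 | 0.06 | 15.43 | 48.72 | 11596.34 | 0.25 | 644.24 | 41 | 2 | 2 | 0 | 12.0 | 21.0 | 95.0 | 54.0 | 11.0 | True
7 | 69.0 | 12.0 | 1.0 | 12.0 | 231.0 | 1212.27 | 0.04 | 27.27 | 44.45 | 33061.94 | 0.40 | 1836.77 | 62 | 3 | 2 | 0 | 16.0 | 22.0 | 156.0 | 75.0 | 23.0 | True
8 | 47.0 | 6.0 | 1.0 | 6.0 | 149.0 | 745.00 | 0.06 | 16.20 | 45.99 | 12069.00 | 0.25 | 670.50 | 41 | 2 | 1 | 0 | 12.0 | 20.0 | 95.0 | 54.0 | 11.0 | True
9 | 48.0 | 7.0 | 1.0 | 7.0 | 155.0 | 801.34 | 0.06 | 17.82 | 44.97 | 14278.39 | 0.27 | 793.24 | 42 | 2 | 1 | 0 | 14.0 | 22.0 | 99.0 | 56.0 | 13.0 | True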

Last rows

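index | loc | v(g) | ev(g) | iv(g) | n | v | l | d | i | e | b | t | lOCode | lOComment | lOBlank | locCodeAndComment | uniq_Op | uniq_Opnd | total_Op | total_Opnd | branchCount | defects
2099 | 14.0 | 2.0 | 1.0 | 2.0 | 34.0 | 141.78 | 0.21 | 4.80 | 29.54 | 680.53 | 0.05 | 37.81 | 12 | 0 | 0 | 0 | 8.0 | 10.0 | 22.0 | 12.0 | 3.0 | False
2100 | 14.0 | 2.0 | 1.0 | 2.0 | 31.0 | 121.11 | 0.15 | 6.86 | 17.66 | 830.49 | 0.04 | 46.14 | 9 | 0 | 1 | 0 | 8.0 | 7.0 | 19.0 | 12.0 | 3.0 | False
2101 | 2.0 | 1.0 | 1.0 | 1.0 | 8.0 | 22.46 | 0.50 | 2.00 | 11.23 | 44.92 | 0.01 | 2.50 | 0 | 0 | 0 | 0 | 4.0 | 3.0 | 5.0 | 3.0 | 1.0 | False
2102 | 14.0 | 2.0 | 1.0 | 2.0 | 28.0 | 109.39 | 0.19 | 5.14 | 21.27 | 562.59 | 0.04 | 31.26 | 11 | 0 | 1 | 0 | 8.0 | 7.0 | 19.0 | 9.0 | 3.0 | False
2103 | 4.0 | 1.0 | 1.0 | 1.0 | 5.0 | 11.61 | 0.67 | 1.50 | 7.74 | 17.41 | 0.00 | 0.97 | 0 | 0 | 0 | 0 | 3.0 | 2.0 | 3.0 | 2.0 | 1.0 | False
2104 | 19.0 | 2.0 | 1.0 | 2.0 | 40.0 | 175.69 | 0.15 | 6.82 | 25.77 | 1197.90 | 0.06 | 66.55 | 12 | 1 | 2 | 0 | 10.0 | 11.0 | 25.0 | 15.0 | 3.0 | False
2105 | 23.0 | 3.0 | 3.0 | 3.0 | 60.0 | 278.63 | 0.10 | 9.69 | 28.75 | 2700.58 | 0.09 | 150.03 | 18 | 1 | 2 | 0 | 12.0 | 13.0 | 39.0 | 21.0 | 5.0 | False
2106 | 2.0 | 1.0 | 1.0 | 1.0 | 4.0 | 8.00 | 0.67 | 1.50 | 5.33 | 12.00 | 0.00 | 0.67 | 0 | 0 | 0 | 0 | 3.0 | 1.0 | 3.0 | 1.0 | 1.0 | False
2107 | 13.0 | 1.0 | 1.0 | 1.0 | 17.0 | 60.94 | 0.25 | 4.00 | 15.24 | 243.78 | 0.02 | 13.54 | 6 | 0 | 5 | 0 | 6.0 | 6.0 | 9.0 | 8.0 | 1.0 | False
2108 | 11.0 | 2.0 | 1.0 | 2.0 | 27.0 | 102.80 | 0.17 | 6.00 | 17.13 | 616.79 | 0.03 | 34.27 | 9 | 0 | 0 | 0 | 8.0 | 6.0 | 18.0 | 9.0 | 3.0 | False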